Skip to content

Conversation

@Dooruk
Copy link
Collaborator

@Dooruk Dooruk commented Dec 22, 2025

I made this minor change a while back but forgot to create a PR, not sure if this is the way we want to go about this. I see Travis added a note:

  ! TODO the jacobians should really be stored in atlas fields, but
  !  I didn't feel like dealing with all that refactoring

Is that something I could tackle with reasonable amount of effort? I'm not familiar with the atlas layer.. I would like to learn though so I could use some guidance on that?

This unfortunately doesn't write another variable to the same file. While looking for solutions to write two steric jacobian fields to the same file, gpt suggested using fms2_io_mod. I see this is used in fv3-jedi I'm wondering if fms2_io is something we want to use in SOCA? I also saw this comment from last year.

Relevant issue #1106

Todo:

  • edit some test ymls

Copy link
Contributor

Copilot AI left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Pull request overview

This PR adds functionality to optionally write dynamic height Jacobians (Kst and Ksshts) to NetCDF files during the balance operator setup phase. The implementation introduces a new utility module for writing Jacobians using FMS I/O routines.

  • Introduces soca_write_jacobian_mod module with a function to write 3D Jacobian arrays to NetCDF files
  • Adds optional Jacobian output for Kst and Ksshts balance operators controlled by configuration
  • Enables diagnostic output of computed Jacobian fields for analysis and debugging

Reviewed changes

Copilot reviewed 3 out of 3 changed files in this pull request and generated 4 comments.

File Description
src/soca/LinearVariableChange/Balance/soca_write_jacobian_mod.F90 New module providing write_jacobian_to_netcdf function using FMS I/O routines
src/soca/LinearVariableChange/Balance/soca_balance_mod.F90 Adds optional Jacobian output blocks for Kst and Ksshts operators with configuration-based file writing
src/soca/LinearVariableChange/Balance/CMakeLists.txt Registers new soca_write_jacobian_mod.F90 source file in build system

💡 Add Copilot custom instructions for smarter, more guided reviews. Learn how to get started.

Comment on lines +207 to +208
call write_jacobian_to_netcdf(self%ksshts%kssht, geom, str, "kssht_jacobian")
call write_jacobian_to_netcdf(self%ksshts%ksshs, geom, str, "ksshs_jacobian")
Copy link

Copilot AI Dec 22, 2025

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Writing both kssht_jacobian and ksshs_jacobian to the same file by calling write_jacobian_to_netcdf twice with the same filename will likely result in the second call overwriting the first, or potentially causing errors. The FMS write_data function may not support appending multiple variables to the same file in this manner. Consider either using different filenames for each Jacobian or implementing a mechanism to write both variables in a single call.

Suggested change
call write_jacobian_to_netcdf(self%ksshts%kssht, geom, str, "kssht_jacobian")
call write_jacobian_to_netcdf(self%ksshts%ksshs, geom, str, "ksshs_jacobian")
filename = str
call write_jacobian_to_netcdf(self%ksshts%kssht, geom, filename, "kssht_jacobian")
filename = trim(str)//"_ksshs"
call write_jacobian_to_netcdf(self%ksshts%ksshs, geom, filename, "ksshs_jacobian")

Copilot uses AI. Check for mistakes.

! declarations related to the dynamic height Jacobians
character(len=:), allocatable :: filename
character(len=:), allocatable :: str
Copy link

Copilot AI Dec 22, 2025

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The variable 'str' is declared but never deallocated. Since it's an allocatable character string that gets assigned values from f_conf%get_or_die, it should be deallocated after use to prevent memory leaks, or the allocation should be handled more carefully within the scope where it's used.

Copilot uses AI. Check for mistakes.
Comment on lines +27 to +33
call fms_io_init()
call set_domain(geom%Domain%mpp_domain)

! Write the jacobian data from all PEs
call write_data(filename, varname, jacobian, geom%Domain%mpp_domain)

call fms_io_exit()
Copy link

Copilot AI Dec 22, 2025

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Calling fms_io_init and fms_io_exit on every invocation can be problematic. In soca_geom_mod, fms_io_init is guarded by a counter check to ensure it's only called once per geometry initialization. Repeatedly initializing and finalizing the FMS I/O subsystem for each Jacobian write could lead to resource management issues or conflicts, especially if multiple Jacobians are written in sequence. Consider checking if FMS I/O is already initialized or coordinating with the existing initialization in soca_geom_mod.

Copilot uses AI. Check for mistakes.
Co-authored-by: Copilot <[email protected]>
@shlyaeva
Copy link
Collaborator

shlyaeva commented Jan 7, 2026

Is that something I could tackle with reasonable amount of effort? I'm not familiar with the atlas layer.. I would like to learn though so I could use some guidance on that?

I don't have a good estimate as I don't know this part of the code well enough myself. I think it's a good idea though, and I would be happy to look into it together and provide some guidance, let me know!

fms2_io_mod. I see this is used in fv3-jedi I'm wondering if fms2_io is something we want to use in SOCA? I also saw https://github.com/JCSDA-internal/fv3-jedi-linearmodel/issues/18#issuecomment-2403328653 from last year.

For the comment, we've resolved this, I think, with #1122 (and a follow-up bugfix). But yes, we may want to change our I/O. I'm not familiar enough with fms_io or fms2_io and personally currently feeling lukewarm about getting into it 😆 I did just spend a few days looking at refactoring I/O completely (not using fms io, at least for state/increment I/O, just using direct netcdf read/writes), but I have temporarily given up on that.

@Dooruk
Copy link
Collaborator Author

Dooruk commented Jan 12, 2026

Thanks @shlyaeva! I'm pinging @guillaumevernieres.

Like I mentioned, this particular PR is pretty straightforward but for instance I can't write sshS and sshT to the same file, or add layer thickness to simplify my plotting later. Do i need to describe jacobians as fieldsets first? Also we will need this for geostrophic balance too (in the long term), #1172 I see there is work underway?

Sounds like neither switching to FMS2 or Atlas methods are straightforward, especially for my JEDI code knowledge level. However would we get significant memory gains for SOCA this way? I would like to understand how to use Atlas and the best way would be by doing something for me but it may take a long time 👴🏼

@Dooruk
Copy link
Collaborator Author

Dooruk commented Jan 12, 2026

Also support for FMS(1) is dropped in this version?:

https://github.com/NOAA-GFDL/FMS/releases/tag/2025.03

And this uses fms_to_init which is problematic and Anna made a PR addressing that before.. I guess Travis had some thoughts on this too:

#1123 (comment)

@shlyaeva
Copy link
Collaborator

Incidentally, I think I may have a branch switching soca fields fms_io to fms2_io (thanks to copilot and Dave who did the work in fv3-jedi). It is right now tangled with a bunch of other development at various stages, and I haven't tested outside of soca tests yet, but I'll try to cleanup and issue a draft PR in a testable form this week.

@shlyaeva
Copy link
Collaborator

However would we get significant memory gains for SOCA this way? I would like to understand how to use Atlas and the best way would be by doing something for me but it may take a long time.

happy to talk through this, let me know! I think the jacobian task may be a good case for understanding how to use atlas, and there's no rush in getting it done so low risk too!

@shlyaeva
Copy link
Collaborator

Incidentally, I think I may have a branch switching soca fields fms_io to fms2_io (thanks to copilot and Dave who did the work in fv3-jedi). It is right now tangled with a bunch of other development at various stages, and I haven't tested outside of soca tests yet, but I'll try to cleanup and issue a draft PR in a testable form this week.

My first high-res high I/O LETKF test is disappointing: I/O in my branch with fms2_io is 2x slower than our already slow I/O on develop, so I think I might have to dig into the code before it's worth looking at it or testing (but if anyone is interested, branch feature/rewrite_soca2cice)

@Dooruk
Copy link
Collaborator Author

Dooruk commented Jan 16, 2026

I asked gemini your commit, I fully agree 😝 :

In the commit c85ceaaa here is an analysis of the current implementation and how it can be optimized:

1. Potential Speed/Performance Issues

  • Redundant Metadata Lookups in Loops:
    The soca_register_field helper is called inside a loop over all fields. Inside this helper, you perform multiple calls to get_variable_num_dimensions, get_variable_dimension_names, and manual loops over fileobj%nx/fileobj%ny.
    • Impact: For a large number of variables (e.g., in high-resolution runs or complex chemistry), these metadata operations can add measurable overhead. FMS2 metadata operations are generally fast, but doing them 100+ times per file open is inefficient.
  • Repeated open_file / close_file:
    In soca_fields_read, you open the ice file once to check for variable existence and then potentially open it (or another file) again for the actual read.
    • Impact: File system operations (open/close) are expensive, especially on parallel file systems like Lustre.
  • String Manipulations:
    There is a lot of trim() and string concatenation inside the registration loop. In Fortran, frequent string operations in a hot loop can be a minor bottleneck.

2. Suggestions for Improvement

A. Consolidate Axis Registration

Instead of checking and registering axes for every single field, register your standard horizontal (and common vertical) axes once immediately after opening the file.

  • Suggestion: Create a soca_register_axes(fileobj) routine that ensures xaxis, yaxis, and Time are set up. Then soca_register_field only needs to handle the actual field registration.

B. Optimize soca_register_field

  • Use is_dimension_registered consistently: In the "Writing" section of your helper, you manually loop over fileobj%nx. It's cleaner and safer to use the public FMS2 API is_dimension_registered where possible.
  • Handle the Vertical Dimension on Read:
    In the reading logic:
    if ( .not. is_dimension_registered(fileobj, trim(dim_names(1))) ) ... ! x
    if ( .not. is_dimension_registered(fileobj, trim(dim_names(2))) ) ... ! y
    ! Missing z-axis registration?
    If you are reading a 3D field, you should also register the 3rd dimension as an axis to ensure FMS2 correctly maps the vertical levels.

C. Avoid Multiple Opens

In soca_fields_read, try to keep the file handle open if you know you're about to read from it.

  • Suggestion: Modify the logic so that open_file happens once, then you perform your variable_exists checks, then the registrations, then the read_restart, and finally close_file.

D. Batch Field Registration

FMS2 is designed to "record" all registrations and then perform a collective I/O operation. Your current structure follows this well, but ensure that all register_restart_field calls happen before the single read_restart or write_restart call to maximize I/O throughput.

3. Code Quality Observations

  • Hardcoded Time Padding:
    In the writing section, you have (/ xdim_name, ydim_name, zdim_name, 'Time '/).
    • Suggestion: Use trim('Time') or just 'Time' without the spaces. FMS2 handles string trimming internally, and the hardcoded 8-character string might look confusing to other developers.
  • Extension Handling:
    You added .nc to soca_genfilename. Ensure that this doesn't conflict with FMS2's internal logic if dont_add_res_to_filename is set to .false. in other parts of the code, as FMS might try to append suffixes itself.

Summary of Suggested Refactoring for soca_register_field

! Logic for writing could be simplified:
if (.not. is_dimension_registered(fileobj, xdim_name)) then
    call register_axis(fileobj, xdim_name, 'x', domain_position=center)
    call register_field(fileobj, xdim_name, 'double', (/ xdim_name /))
    ...
end if
! Repeat for y, z, and Time...

By moving these checks out of the field loop, you'll see better performance in the "setup" phase of your I/O operations.

@shlyaeva
Copy link
Collaborator

@Dooruk oh neat! do you mind committing to the branch if you have those changes? did you run comparison with develop, is it reaching at least the same (not great) performance after those, or maybe even getting better?

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

3 participants